AITopics | wilcoxon signed-rank test

Collaborating Authors

wilcoxon signed-rank test

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix for " CaMiT: ATime-Aware Car Model Dataset for Classification and Generation "

Neural Information Processing SystemsJun-18-2026, 21:43:21 GMT

Filtering Step Remaining Instances Raw images collected 7.5Mimages After deduplication and car detection 4.9M images Initial car bounding boxes 13.22M boxes After score/size thresholding 6.97M boxes After Qwen2.5-7B

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.69)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
(2 more...)

Add feedback

Sharing but Not Caring: Similar Outcomes for Shared Control and Switching Control in Telepresence-Robot Navigation

Kalliokoski, Juho, Center, Evan G., LaValle, Steven M., Ojala, Timo, Sakcak, Basak

arXiv.org Artificial IntelligenceSep-9-2025

Abstract-- T elepresence robots enable users to interact with remote environments, but efficient and intuitive navigation remains a challenge. In this work, we developed and evaluated a shared control method, in which the robot navigates autonomously while allowing users to affect the path generation to better suit their needs. We compared this with control switching, where users toggle between direct and automated control. We hypothesized that shared control would maintain efficiency comparable to control switching while potentially reducing user workload. The results of two consecutive user studies (each with final sample of n = 20) showed that shared control does not degrade navigation efficiency, but did not show a significant reduction in task load compared to control switching. Further research is needed to explore the underlying factors that influence user preference and performance in these control systems. Telepresence robots represent a class of robotic systems in which a mobile robot is equipped with a camera streaming live video to a display that is watched by a remote user. These robots have already found a wide domain of applications ranging from business meetings and factory tours to personal events such as graduations and weddings. Furthermore, immersive-telepresence robots allow a panoramic camera stream to be watched by a remote user via a head-mounted display (HMD). These systems carry great potential in improving the user interaction with the remote environment, which is found to be lacking in the commercial systems [1].

artificial intelligence, human computer interaction, robot, (16 more...)

arXiv.org Artificial Intelligence

2509.05672

Country: Europe > Finland (0.29)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study > Negative Result (0.31)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.50)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.46)

Add feedback

Systematic Evaluation of Attribution Methods: Eliminating Threshold Bias and Revealing Method-Dependent Performance Patterns

Aksoy, Serra

arXiv.org Artificial IntelligenceSep-4-2025

Attribution methods explain neural network predictions by identifying influential input features, but their evaluation suffers from threshold selection bias that can reverse method rankings and undermine conclusions. Current protocols binarize attribution maps at single thresholds, where threshold choice alone can alter rankings by over 200 percentage points. We address this flaw with a threshold-free framework that computes Area Under the Curve for Intersection over Union (AUC-IoU), capturing attribution quality across the full threshold spectrum. Evaluating seven attribution methods on dermatological imaging, we show single-threshold metrics yield contradictory results, while threshold-free evaluation provides reliable differentiation. XRAI achieves 31% improvement over LIME and 204% over vanilla Integrated Gradients, with size-stratified analysis revealing performance variations up to 269% across lesion scales. These findings establish methodological standards that eliminate evaluation artifacts and enable evidence-based method selection. The threshold-free framework provides both theoretical insight into attribution behavior and practical guidance for robust comparison in medical imaging and beyond.

artificial intelligence, evaluation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.03176

Country: Europe > Germany (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback

A novel language model for predicting serious adverse event results in clinical trials from their prospective registrations

Hu, Qixuan, Zhang, Xumou, Kim, Jinman, Bourgeois, Florence, Dunn, Adam G.

arXiv.org Artificial IntelligenceAug-12-2025

Objectives: With accurate estimates of expected safety results, clinical trials could be better designed and monitored. We evaluated methods for predicting serious adverse event (SAE) results in clinical trials using information only from their registrations prior to the trial. Material and Methods: We analyzed 22,107 two-arm parallel interventional clinical trials from ClinicalTrials.gov with structured summary results. Two prediction models were developed: a classifier predicting whether a greater proportion of participants in an experimental arm would have SAEs (area under the receiver operating characteristic curve; AUC) compared to the control arm, and a regression model to predict the proportion of participants with SAEs in the control arms (root mean squared error; RMSE). A transfer learning approach using pretrained language models (e.g., ClinicalT5, BioBERT) was used for feature extraction, combined with a downstream model for prediction. To maintain semantic representation in long trial texts exceeding localized language model input limits, a sliding window method was developed for embedding extraction. Results: The best model (ClinicalT5+Transformer+MLP) had 77.6% AUC when predicting which trial arm had a higher proportion of SAEs. When predicting SAE proportion in the control arm, the same model achieved RMSE of 18.6%. The sliding window approach consistently outperformed direct comparisons. Across 12 classifiers, the average absolute AUC increase was 2.00%, and absolute RMSE reduction was 1.58% across 12 regressors. Discussion: Summary results data from ClinicalTrials.gov remains underutilized. Predicted results of publicly reported trials provides an opportunity to identify discrepancies between expected and reported safety results.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.22919

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

Microelectrode Signal Dynamics as Biomarkers of Subthalamic Nucleus Entry on Deep Brain Stimulation: A Nonlinear Feature Approach

Tavares, Ana Luiza S., Neto, Artur Pedro M., Gomes, Francinaldo L., Reis, Paul Rodrigo dos, da Silva, Arthur G., Junior, Antonio P., Gomes, Bruno D.

arXiv.org Artificial IntelligenceJul-1-2025

Accurate intraoperative localization of the subthalamic nucleus (STN) is essential for the efficacy of Deep Brain Stimulation (DBS) in patients with Parkinson's disease. While microelectrode recordings (MERs) provide rich electrophysiological information during DBS electrode implantation, current localization practices often rely on subjective interpretation of signal features. In this study, we propose a quantitative framework that leverages nonlinear dynamics and entropy-based metrics to classify neural activity recorded inside versus outside the STN. MER data from three patients were preprocessed using a robust artifact correction pipeline, segmented, and labelled based on surgical annotations. A comprehensive set of recurrence quantification analysis, nonlinear, and entropy features were extracted from each segment. Multiple supervised classifiers were trained on every combination of feature domains using stratified 10-fold cross-validation, followed by statistical comparison using paired Wilcoxon signed-rank tests with Holm-Bonferroni correction. The combination of entropy and nonlinear features yielded the highest discriminative power, and the Extra Trees classifier emerged as the best model with a cross-validated F1-score of 0.902+/-0.027 and ROC AUC of 0.887+/-0.055. Final evaluation on a 20% hold-out test set confirmed robust generalization (F1= 0.922, ROC AUC = 0.941). These results highlight the potential of nonlinear and entropy signal descriptors in supporting real-time, data-driven decision-making during DBS surgeries

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.22454

Country: South America > Brazil (0.15)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections

Kirchner, Max, Jenke, Alexander C., Bodenstedt, Sebastian, Kolbinger, Fiona R., Saldanha, Oliver L., Kather, Jakob N., Wagner, Martin, Speidel, Stefanie

arXiv.org Artificial IntelligenceMay-9-2025

Purpose: In this study, we investigate the training of foundation models using federated learning to address data-sharing limitations and enable collaborative model training without data transfer for minimally invasive surgery. Methods: Inspired by the EndoViT study, we adapt the Masked Autoencoder for federated learning, enhancing it with adaptive Sharpness-Aware Minimization (FedSAM) and Stochastic Weight Averaging (SWA). Our model is pretrained on the Endo700k dataset collection and later fine-tuned and evaluated for tasks such as Semantic Segmentation, Action Triplet Recognition, and Surgical Phase Recognition. Results: Our findings demonstrate that integrating adaptive FedSAM into the federated MAE approach improves pretraining, leading to a reduction in reconstruction loss per patch. The application of FL-EndoViT in surgical downstream tasks results in performance comparable to CEN-EndoViT. Furthermore, FL-EndoViT exhibits advantages over CEN-EndoViT in surgical scene segmentation when data is limited and in action triplet recognition when large datasets are used. Conclusion: These findings highlight the potential of federated learning for privacy-preserving training of surgical foundation models, offering a robust and generalizable solution for surgical data science. Effective collaboration requires adapting federated learning methods, such as the integration of FedSAM, which can accommodate the inherent data heterogeneity across institutions. In future, exploring FL in video-based models may enhance these capabilities by incorporating spatiotemporal dynamics crucial for real-world surgical environments.

artificial intelligence, machine learning, recognition, (18 more...)

arXiv.org Artificial Intelligence

2504.16612

Country:

Europe > Germany (0.29)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Bringing Order Amidst Chaos: On the Role of Artificial Intelligence in Secure Software Engineering

Esposito, Matteo

arXiv.org Artificial IntelligenceJan-9-2025

Context. Developing secure and reliable software remains a key challenge in software engineering (SE). The ever-evolving technological landscape offers both opportunities and threats, creating a dynamic space where chaos and order compete. Secure software engineering (SSE) must continuously address vulnerabilities that endanger software systems and carry broader socio-economic risks, such as compromising critical national infrastructure and causing significant financial losses. Researchers and practitioners have explored methodologies like Static Application Security Testing Tools (SASTTs) and artificial intelligence (AI) approaches, including machine learning (ML) and large language models (LLMs), to detect and mitigate these vulnerabilities. Each method has unique strengths and limitations. Aim. This thesis seeks to bring order to the chaos in SSE by addressing domain-specific differences that impact AI accuracy. Methodology. The research employs a mix of empirical strategies, such as evaluating effort-aware metrics, analyzing SASTTs, conducting method-level analysis, and leveraging evidence-based techniques like systematic dataset reviews. These approaches help characterize vulnerability prediction datasets. Results. Key findings include limitations in static analysis tools for identifying vulnerabilities, gaps in SASTT coverage of vulnerability types, weak relationships among vulnerability severity scores, improved defect prediction accuracy using just-in-time modeling, and threats posed by untouched methods. Conclusions. This thesis highlights the complexity of SSE and the importance of contextual knowledge in improving AI-driven vulnerability and defect prediction. The comprehensive analysis advances effective prediction models, benefiting both researchers and practitioners.

baseline versus isolation prediction, learning-based vulnerability detection, process labelling dimension assessed, (16 more...)

arXiv.org Artificial Intelligence

2501.05165

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Washington > King County > Seattle (0.13)
North America > Canada > Quebec > Montreal (0.04)
(55 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
(6 more...)

Add feedback

A transparency-based action model implemented in a robotic physical trainer for improved HRI

Naama, Aharony, Maya, Krakovski, Yael, Edan

arXiv.org Artificial IntelligenceJun-16-2024

Transparency is an important aspect of human-robot interaction (HRI), as it can improve system trust and usability leading to improved communication and performance. However, most transparency models focus only on the amount of information given to users. In this paper, we propose a bidirectional transparency model, termed a transparency-based action (TBA) model, which in addition to providing transparency information (robot-to-human), allows the robot to take actions based on transparency information received from the human (robot-of-human and human-to-robot). We implemented a three-level (High, Medium and Low) TBA model on a robotic system trainer in two pilot studies (with students as participants) to examine its impact on acceptance and HRI. Based on the pilot studies results, the Medium TBA level was not included in the main experiment, which was conducted with older adults (aged 75-85). In that experiment, two TBA levels were compared: Low (basic information including only robot-to-human transparency) and High (including additional information relating to predicted outcomes with robot-of-human and human-to-robot transparency). The results revealed a significant difference between the two TBA levels of the model in terms of perceived usefulness, ease of use, and attitude. The High TBA level resulted in improved user acceptance and was preferred by the users.

information, robot, transparency, (16 more...)

arXiv.org Artificial Intelligence

2406.10848

Country:

Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)
North America > United States > Texas > Bexar County > San Antonio (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.70)
Research Report > New Finding (0.69)

Industry:

Education (1.00)
Health & Medicine > Consumer Health (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

Nan, Yang, Xing, Xiaodan, Wang, Shiyi, Tang, Zeyu, Felder, Federico N, Zhang, Sheng, Ledda, Roberta Eufrasia, Ding, Xiaoliu, Yu, Ruiqi, Liu, Weiping, Shi, Feng, Sun, Tianyang, Cao, Zehong, Zhang, Minghui, Gu, Yun, Zhang, Hanxiao, Gao, Jian, Tang, Wen, Yu, Pengxin, Kang, Han, Chen, Junqiang, Lu, Xing, Zhang, Boyu, Mamalakis, Michail, Prinzi, Francesco, Carlini, Gianluca, Cuneo, Lisa, Banerjee, Abhirup, Xing, Zhaohu, Zhu, Lei, Mesbah, Zacharia, Jain, Dhruv, Mayet, Tsiry, Yuan, Hongyu, Lyu, Qing, Wells, Athol, Walsh, Simon LF, Yang, Guang

arXiv.org Artificial IntelligenceDec-21-2023

Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current public-available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the most correlated QIB of mortality prediction. A training set of 120 high-resolution computerised tomography (HRCT) scans were publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients. The results have shown that the capacity of extracting airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (Hazard ratio>1.5, p<0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.

airway segmentation, prediction, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2312.13752

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > China > Shanghai > Shanghai (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Measuring Adversarial Datasets

Bai, Yuanchen, Huang, Raoyi, Viswanathan, Vijay, Kuo, Tzu-Sheng, Wu, Tongshuang

arXiv.org Artificial IntelligenceNov-6-2023

In the era of widespread public use of AI systems across various domains, ensuring adversarial robustness has become increasingly vital to maintain safety and prevent undesirable errors. Researchers have curated various adversarial datasets (through perturbations) for capturing model deficiencies that cannot be revealed in standard benchmark datasets. However, little is known about how these adversarial examples differ from the original data points, and there is still no methodology to measure the intended and unintended consequences of those adversarial transformations. In this research, we conducted a systematic survey of existing quantifiable metrics that describe text instances in NLP tasks, among dimensions of difficulty, diversity, and disagreement. We selected several current adversarial effect datasets and compared the distributions between the original and their adversarial counterparts. The results provide valuable insights into what makes these datasets more challenging from a metrics perspective and whether they align with underlying assumptions.

dataset, information, metric, (15 more...)

arXiv.org Artificial Intelligence

2311.03566

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (0.88)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback